A Note on Modeling Retweet Cascades on Twitter
Abstract
Information cascades on social networks, such as retweet cascades on Twitter, have often been viewed as an epidemiological process, with the associated notion of virality capturing popular cascades that spread across the network. The notion of structural virality (or average path length) has been posited as a measure of global spread. In this paper, we argue that this simple epidemiological view, though analytically compelling, is not the entire story. We first show empirically that the classical SIR diffusion process on the Twitter graph, even with the best possible distribution of the infectiousness parameter, cannot explain the nature of observed retweet cascades on Twitter. More specifically, rather than spreading further from the source as the SIR model would predict, many cascades that have several retweets from direct followers die out quickly beyond that. We show that our empirical observations can be reconciled if we take the interests of users and tweets into account. In particular, we consider a model where users have multi-dimensional interests and connect to other users based on similarity in interests. Tweets are correspondingly labeled with interests, and propagate only in the subgraph of interested users via the SIR process. In this model, interests can be either narrow or broad: the narrowest interest corresponds to a star graph on the interested users, rooted at the source of the tweet, and the broadest interest spans the whole graph. We show that if tweets are generated using such a mix of interests, coupled with a varying infectiousness parameter, then we can qualitatively explain our observation that cascades die out much more quickly than the SIR model predicts. At the same time, this model also explains how cascades can have large size but low “structural virality”, or average path length.
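The mechanism described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes a simplified discrete-time SIR variant in which each infected user gets exactly one chance to pass the tweet to each follower with probability `p`, and it computes structural virality as the mean pairwise distance between nodes of the cascade tree. The function names (`sir_cascade`, `restrict_to_interested`, `structural_virality`) and the toy follower graph are hypothetical.

```python
import random
from collections import deque


def sir_cascade(followers, source, p, rng):
    """Simplified SIR (assumption): every infected user tries once to infect
    each follower with probability p, then recovers. Returns child -> parent."""
    parent = {source: None}
    frontier = deque([source])
    while frontier:
        u = frontier.popleft()
        for v in followers.get(u, ()):
            if v not in parent and rng.random() < p:
                parent[v] = u
                frontier.append(v)
    return parent


def restrict_to_interested(followers, interested):
    """Keep only the subgraph induced by users interested in the tweet."""
    return {
        u: [v for v in vs if v in interested]
        for u, vs in followers.items()
        if u in interested
    }


def structural_virality(parent):
    """Mean distance over all pairs of distinct nodes in the cascade tree."""
    nodes = list(parent)
    n = len(nodes)
    if n < 2:
        return 0.0
    # Build an undirected adjacency list from the child -> parent map.
    adj = {u: [] for u in nodes}
    for child, par in parent.items():
        if par is not None:
            adj[child].append(par)
            adj[par].append(child)
    total = 0
    for s in nodes:
        # Breadth-first search gives tree distances from s to every node.
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
    return total / (n * (n - 1))  # total counts each unordered pair twice


# Toy follower graph (hypothetical): user 0 tweets; 1-4 follow 0;
# 5-6 follow 1; 7-8 follow 2.
followers = {0: [1, 2, 3, 4], 1: [5, 6], 2: [7, 8]}
rng = random.Random(0)

broad = sir_cascade(followers, 0, 1.0, rng)
narrow_graph = restrict_to_interested(followers, {0, 1, 2, 3, 4})
narrow = sir_cascade(narrow_graph, 0, 1.0, rng)

print(len(broad), structural_virality(broad))
print(len(narrow), structural_virality(narrow))
```

With a narrow interest, the cascade is confined to a star around the source, so it stops after the direct followers regardless of how infectious the tweet is, and its mean pairwise distance stays below 2 no matter how many followers retweet.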
Similar resources
Modeling a Retweet Network via an Adaptive Bayesian Approach
Twitter (and similar microblogging services) has become a central nexus for discussion of the topics of the day. Twitter data contains rich content and structured information on users’ topics of interest and behavior patterns. Correctly analyzing and modeling Twitter data enables the prediction of the user behavior and preference in a variety of practical applications, such as tweet recommendat...
TiDeH: Time-Dependent Hawkes Process for Predicting Retweet Dynamics
Online social networking services allow their users to post content in the form of text, images or videos. The main mechanism driving content diffusion is the possibility for users to re-share the content posted by their social connections, which may then cascade across the system. A fundamental problem when studying information cascades is the possibility to develop sound mathematical models, ...
User's Action and Decision Making of Retweet Messages towards Reducing Misinformation Spread during Disaster
Online social media such as Facebook, Twitter and YouTube have been used extensively during disasters and emergency situations. Despite the advantages these services offer in letting citizens supply information in uncertain situations, we raise the issue of misinformation spreading on Twitter through retweets. Accordingly, in this study, we conduct a user survey (n = 133) to investigate what is...
Detection and Characterization of Influential Cross-lingual Information Diffusion on Social Networks
Social network services (SNSs) have become new global and multilingual information platforms due to their popularity. In SNSs with content-sharing functionality, such as “retweet” in Twitter and “share” in Facebook, posts are easily and quickly shared among users, and some of them can spread across different regions and languages. In this work, we first define the concept of cross-lingual informa...
Exploration Study of Retweet Propensity
Twitter, a social media platform, can be used to spread information across social networks. A critical aspect of this spread of information is users’ engagement within such environments, specifically the extent to which a user will ‘retweet’. This paper describes three experiments on observational data collected on Twitter. First, this paper will explore the possible causal structure that can be foun...